

For Official Use Only 


A MODEL FOR FORECASTING SOVIET GRAIN YIELDS 


INTRODUCTION 


1. For many years, this Office has assessed the progress of the Soviet grain
crop during the growing season. Independent assessments are necessary mainly
because the USSR Ministry of Agriculture provides only the most general situation
reports during the growing season. Crop results are reported several months after
the completion of the harvest. The tautness of the world food supply and the
consequences of the wide swings in Soviet grain production have increased the need
for timely forecasts of Soviet grain output.


2. This publication documents the development of a model to predict grain 
yields in 33 crop regions (see the map). The predicted yields—based on time 
trends and a composite index of several weather variables—are combined with 
reported data on sown area to obtain crop estimates. With the help of the model, 
crop estimates can be made as early as April and revised to take account of
additional information until the harvest is completed. 


3. The model's structure, computer programs, and weather data base can be 
applied to any crop for which yield data are available. So far, the model has been 
used to forecast yields for all grain, winter wheat, and spring wheat.¹ To estimate
the Soviet grain crop, estimation of sown area and adjustments to reflect collateral 
information are also used. 


PRINCIPAL FINDINGS 


4. A weather-yield model has proved to be useful in estimating Soviet crop 
yields. The predicted total 1974 grain harvest was 2% above the actual harvest. 
The model can produce reasonably reliable estimates early in the season, and these
can then be revised every ten days as new weather data are received.²


5. Data on production, sown area, and yields in 1958-73 have been collected
for all grain, winter wheat, spring wheat, winter rye, and spring barley. A weather
data base covering the period from 1961 to the present has also been assembled
for use with the model. Relying on this data base, the model employs a weather
index (representing the influence of temperature and precipitation) and a time
trend (representing technological change) as explanatory variables in a set of
independent linear equations that predict crop yields.


Note: Comments and queries regarding this publication are welcomed. They may be
directed to f the Office of Economic Research, Code 143, 
Extension 

¹ In this publication, the official Soviet definition of all grain is used, including wheat,
rye, barley, oats, corn, rice, millet, buckwheat, and pulses.

² This model was developed in early 1973 and formed the basis of CIA estimates for
that year. The model was modified slightly in 1974. This publication is limited to the current 
model, and all data refer to it unless otherwise stated. 
6. The USSR has announced a total grain harvest of 195.5 million metric tons and two
republic harvests: 111.6 million tons for the RSFSR and 46 million tons for the Ukraine.
The model's final prediction (end of August) for the USSR was 199.4 million tons,
within 2% of the total announced in December. In September, Politburo Chairman
Brezhnev said the crop would be “not bad” but admitted that areas in Siberia
and Kazakhstan were having trouble. The announcement of a crop of 195.5 million
tons confirms these indications.


7. The prediction for 1974 also seems reasonable in terms of its development
over time and its forecast of regional trends. The prediction increased sharply
during May and June, when the weather was favorable, and then declined as
conditions deteriorated. The final predictions for the RSFSR and the Ukraine (115.3
million tons and 47.9 million tons, respectively) are slightly higher than the
published totals (3% and 4%). The prediction for the remaining portion of the USSR
is 4% below the actual harvest.


8. Although the model describes the historical data quite well, prediction
errors can be significant. Tests show that an error of 2% to 8% in the predicted
national yield can be expected. An error in estimates of the sown area can either
compound or mitigate the effect of errors in predicted yields.
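
The way the two error sources interact can be illustrated with a short calculation. The sketch below assumes that production is simply yield times sown area, so that relative errors combine multiplicatively. The function names are illustrative only; the inputs (an 8% yield error, a national yield of 15 centners per hectare, a sown area of 128 million hectares, and a 2.1% sown-area error) are figures quoted elsewhere in this publication.

```python
def production_error(yield_error, area_error):
    """Relative error in production when production = yield * sown area."""
    return (1.0 + yield_error) * (1.0 + area_error) - 1.0


def harvest_error_tons(yield_error, yield_c_per_ha, area_mha):
    """Harvest error in million tons (1 centner = 0.1 metric ton)."""
    return yield_error * yield_c_per_ha * 0.1 * area_mha


# An 8% overestimate of yield combined with a sown-area error of either sign.
mitigated = production_error(0.08, -0.021)   # area underestimated by 2.1%
compounded = production_error(0.08, 0.021)   # area overestimated by 2.1%
tons = harvest_error_tons(0.08, 15.0, 128.0)

print(f"mitigated: {mitigated:.1%}, compounded: {compounded:.1%}")
print(f"harvest error from yield alone: {tons:.2f} million tons")
```

Under these assumptions the same yield error translates into a production error of roughly 6% or 10%, depending on the sign of the sown-area error.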


FORMULATION OF THE MODEL 
Factors Influencing Yields 


Weather 


9. Although the importance of weather in determining grain yields is
apparent, the causal relationships are difficult to define. Many individual weather
variables affect the final yield. Moisture, for example, influences the number of 
plants per hectare, the number of stalks per plant, leaf area, the number of heads 
per plant, the number of kernels per head, and the weight per kernel. Temperature 
variations speed up or curb growth and change the water balance in the plant. 
Extreme temperatures may injure or kill the plant. 


10. Other weather variables that influence yields include sunlight, humidity,
hail, and wind. In particular, the hot sukhovey (dry winds) prevalent in the
eastern grain-growing regions of the USSR greatly increase evaporation and
transpiration, and at times even beat down a crop. Weather variables also affect yields
indirectly because of the importance of good weather for fieldwork at planting and
harvest times and because of their role in encouraging or retarding plant diseases,
parasites, and weeds.


11. Specific patterns of interaction between weather variables and the
physiological requirements of plants (and hence yields) vary considerably according
to the peculiarities of local climates. Nonetheless, certain common factors are
important. In the early stages of growth, at least through tillering, grain requires
a continuous supply of moisture in the upper layer of the soil. After tillering, which
normally occurs a little less than 30 days after sowing, the period of rapid
vegetative growth occurs. As stems and leaves develop, consumption of water by the
plants increases greatly. The heading stage, which is critical to grain yields, usually


³ The predictions represent gross grain production: the grain obtained from the harvesting
machine in the field, including excess moisture, unripe and damaged kernels, weed seeds,
and the losses in handling and transporting the grain between the field and storage facilities.
occurs a little more than 30 days after tillering.⁴ The relatively high temperatures
that generally prevail in the USSR during this stage result in increased
transpiration and in the more rapid depletion of soil moisture by evaporation. At that stage,
therefore, the plants require more rainfall than during earlier stages of
development.


12. After heading, the plant's dependence on moisture and its sensitivity to
temperature decrease. Excessive rainfall still may damage the crop, however, by
causing lodging (matting) and by promoting diseases (such as rust, scab, mildew,
and leaf spot) as well as weed growth. As in the earlier stages of growth, excessive
temperatures may be injurious, especially if accompanied by dry winds. Too low
a temperature preceding the harvest may delay ripening so that the crop is caught
by early frost.


13. Links between weather factors and the stages of growth cannot be defined
with precision because of annual variations in sowing dates and in the seasonal
pattern of growth among different grains and areas. For example, the same grain
in the same area may be planted at substantially different times in different years,
and separate varieties planted in the same region may grow at different rates.


Technology 


14. Crop yields have increased steadily since 1958 in many regions of the
USSR but very slowly in other regions (see Table 1).⁵ The systematic improvement
that has occurred probably reflects increased use of agricultural chemicals
(fertilizer, lime, and insecticides) and better and more timely cultivation and
harvesting. All of these changes are technological improvements. Unfortunately,


⁴ These stages of development refer to spring-sown small grains. Fall-sown winter grains
usually tiller before entering dormancy.

⁵ Data for earlier years could not be used, because in 1958 the official measure of the
grain crop is believed to have changed over to a new basis. For a discussion of this question,
see Eberhard Schinke, “Soviet Agricultural Statistics” in Vladimir G. Treml and John P. Hardt 
(eds.), Soviet Economic Statistics, p. 242. 


Table 1

USSR: Grain Yields
                                                                Centners per Hectare

        All Grain                                 Winter Wheat            Spring Wheat
Year    USSR   RSFSR  Ukraine  Kazakhstan    USSR   RSFSR  Ukraine   USSR   RSFSR        Kazakhstan
1958    11.1   10.6   16.8     9.5           16.2   17.5   17.6      9.7    9.9          9.6
1959    10.4   9.9    16.1     8.7           15.2   13.5   18.6      9.4    9.8          8.8
1960    10.0   10.7   15.8     8.4           15.1   16.1   17.5      9.5    10.2         8.4
1961    10.7   9.9    19.9     6.6           16.9   15.6   21.9      8.2    9.0          6.9
1962    10.9   11.0   17.9     6.5           16.8   18.6   17.6      8.2    9.2          6.5
1963    8.3    8.3    12.9     4.4           12.9   13.4   14.2      5.9    7.0          3.8
1964    11.4   10.7   17.7     9.8           13.8   12.9   17.0      9.9    9.9          9.9
1965    9.5    9.0    19.2     3.1           16.1   14.1   21.3      5.5    6.3          3.1
1966    13.7   13.1   21.5     10.8          20.4   19.6   24.8      12.0   12.6         11.0
1967    12.1   11.9   20.5     6.3           17.8   15.7   23.1      8.9    10.5         6.1
1968    14.0   14.7   18.5     8.4           18.3   19.0   20.5      12.2   14.4         8.3
1969    13.2   12.2   23.0     8.8           18.9   15.5   23.5      10.1   11.0         8.6
1970    15.6   15.6   23.4     9.8           22.8   23.9   26.0      12.3   13.8         9.6
1971    15.4   14.6   25.4     [illegible]   23.1   20.8   20.9      11.8   [illegible]  9.4
1972    14.0   12.5   21.2     12.5          19.6   10.7   25.5      13.0   13.1         12.9
1973    17.6   16.8   29.0     11.2          27.0   25.2   31.9      13.5   14.8         11.2
information on such improvements is incomplete and, when available, represents
national or republic totals, not the individual crop regions required for the crop
prediction model.


15. As a surrogate for technological improvement, a time trend was tested
in all crop regions and used where a high correlation between yields and time was
found. In the districts where time was not correlated with crop yields (mainly
the spring wheat belt of Western Siberia and Kazakhstan), new higher yielding
varieties have not been introduced, and normal precipitation is inadequate for
efficient use of most types of fertilizer.


Availability of Data


16. The weather data used in the prediction model include observations on
precipitation and temperature for each of 27 crop regions from 1961 to 1973.⁶ The
observations are monthly averages except during the growing season, when
average precipitation readings are available for ten-day periods. Weather data are not
available for crop regions 28-33.


17. The yield data available cover all grain, spring wheat, winter wheat,
winter rye, and spring barley for the years 1958-73. Yields were calculated from
published data for the oblasts within a crop region, except for those few crop
regions that coincide with the areas for which the USSR reports yields.


18. The weather data have their shortcomings. In the early years of
collection (roughly 1961-65), data were transcribed by hand from weather maps,
leading to errors in transcription. Only the most obvious of these errors could be
corrected. More important, the observation for a given crop region is simply an
unweighted average of the observations from all of the weather reporting stations
in that region because of the lack of data on sown area below the oblast level.
Thus, the weather observations in the major crop areas within a crop region were
not accorded a proportionately greater weight.
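
The difference between the unweighted average that was used and the area-weighted average that could not be computed can be sketched as follows. The station readings and sown areas are hypothetical and chosen only for illustration; the report states that sown-area data below the oblast level were not available.

```python
def unweighted_mean(observations):
    """Simple average of station observations, as used for each crop region."""
    return sum(observations) / len(observations)


def area_weighted_mean(observations, sown_areas):
    """Average in which each station counts in proportion to nearby sown area."""
    total_area = sum(sown_areas)
    return sum(o * a for o, a in zip(observations, sown_areas)) / total_area


# Hypothetical monthly precipitation (mm) at three stations in one crop region,
# with hypothetical sown areas (million hectares) around each station.
precip_mm = [20.0, 40.0, 60.0]
sown_area = [1.0, 1.0, 6.0]

simple = unweighted_mean(precip_mm)
weighted = area_weighted_mean(precip_mm, sown_area)
print(simple, weighted)
```

When most of the sown area lies near the wettest station, as in this invented case, the unweighted figure understates the precipitation actually received by the crop.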


19. Errors in the yield data can arise from inconsistencies in official Soviet
statistics, the need to estimate unpublished data, and the need to estimate the
ratio of sown area to aggregate yields. Although many inconsistencies appear in
published Soviet sources, most involve a variance of only 0.1 centner per hectare.
Wheat yields for the RSFSR are usually reported only for spring and winter wheat
combined. By using additional information, yields for spring and winter wheat
could be estimated separately. The estimates that could be checked were quite
accurate; hence, any distortion is believed to be small. For some crops, such as
spring barley, yield data were available for each division within a crop region but
sown areas were unknown. The yield data therefore had to be aggregated by
using estimated weights.⁷


The Prediction Model 


Specification 


20. The prediction model assumes that grain yield is a linear function of
technology and weather. A linear time trend represents the influence of improved
technology (for example, fallowing and increased use of fertilizers) and improved


⁶ Although weather data have been collected since 1958, complete series for all variables
used in the model are available only since 1961. The appendix describes the sources of weather
and yield data. The 27 crop regions are identified on the map.

⁷ These errors are discussed in more detail in the appendix.
[...] comprises a weather index that represents the influence of weather on yields.
21. The Soviet grain yield prediction model is a set of 33 independent linear
equations:
(1) $Y^{*}_{ij} = c^{*}_{0j} + c^{*}_{1j} T_{ij} + c^{*}_{2j} V^{*}_{ij}$
where $Y^{*}$ = predicted yield,
$T$ = time,
$V^{*}$ = the weather index,
$c^{*}_{0j}$, $c^{*}_{1j}$, and $c^{*}_{2j}$ are estimated model parameters,
$i$ = year (1958 to 1974), and
$j$ = crop region (1 to 33).
Estimates of sown area for the 33 crop regions are multiplied by the yield
estimates for the crop regions to obtain forecasts of total production and the
average yield. The estimates of sown area are based on Soviet press reports and
past trends in each crop region.
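
The aggregation from regional yields to a national forecast can be sketched as below. The three regions and their figures are hypothetical; the only fact relied on is that one centner equals 0.1 metric ton, so that centners per hectare times million hectares times 0.1 gives million tons.

```python
def national_forecast(yields_c_per_ha, sown_area_mha):
    """Combine regional yields (centners/ha) with sown areas (million ha).

    Returns total production in million tons and the average yield in
    centners per hectare.
    """
    production = sum(0.1 * y * a for y, a in zip(yields_c_per_ha, sown_area_mha))
    total_area = sum(sown_area_mha)
    average_yield = 10.0 * production / total_area
    return production, average_yield


# Hypothetical three-region example (not data from this publication).
production, average_yield = national_forecast([20.0, 10.0, 15.0], [10.0, 30.0, 20.0])
print(f"{production:.1f} million tons, {average_yield:.2f} centners per hectare")
```

The average yield obtained this way is an area-weighted mean of the regional yields, which is why errors in regional sown areas feed into both outputs.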
22. The eleven weather variables used to form the weather index are:
• accumulated precipitation in millimeters from October through March;
• monthly precipitation in millimeters for April, May, June, July, and
August; and
• average monthly temperatures in degrees centigrade for April, May,
June, July, and August.
Because weather data are not available for crop regions 28 through 33, $c^{*}_{2j}$ in
equation (1) assumes the value of zero in these six crop regions. The weather
variables listed above were selected from a set of 54 standard weather variables
now available for weather-yield regression work on the USSR (see the appendix).
In addition, new variables can be formed by combining standard variables.
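
As an illustration of how the eleven variables might be assembled for one crop region and one crop year, consider the sketch below. The dictionary layout, month keys, and sample values are assumptions made for the example and do not describe the actual computer files.

```python
PRESEASON = ["oct", "nov", "dec", "jan", "feb", "mar"]
GROWING = ["apr", "may", "jun", "jul", "aug"]


def weather_variables(precip_mm, temp_c):
    """Return the 11 variables: 1 preseason total, 5 precipitation, 5 temperature."""
    variables = [sum(precip_mm[m] for m in PRESEASON)]   # Oct-Mar accumulation
    variables += [precip_mm[m] for m in GROWING]         # monthly precipitation
    variables += [temp_c[m] for m in GROWING]            # monthly mean temperature
    return variables


# Hypothetical monthly data for a single crop region.
precip = {"oct": 30, "nov": 25, "dec": 20, "jan": 15, "feb": 15, "mar": 20,
          "apr": 28, "may": 40, "jun": 55, "jul": 60, "aug": 45}
temp = {"apr": 6.5, "may": 14.0, "jun": 18.5, "jul": 20.5, "aug": 19.0}

w = weather_variables(precip, temp)
print(len(w), w[0])
```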


Estimation 

23. We estimated the Soviet grain yield prediction model in four steps.
First, we estimated the effect of a time trend on yield in each of the 33 crop
regions. We then estimated the influence of the weather variables, constructed
the weather index, and finally estimated the parameters $c^{*}_{0j}$, $c^{*}_{1j}$, and $c^{*}_{2j}$ in
equation (1).

24. In step one, we estimated the parameters $a_{0j}$ and $a_{1j}$ shown in equation
(2). We derived these estimates by performing linear regressions for the 33
time-yield equations:

(2) $Y_{ij} = a_{0j} + a_{1j} T_{ij} + e_{ij}$

where $Y_{ij}$ = reported yield,
$i$ = year (1958 to 1973),
$j$ = crop region (1 to 33), and
$e_{ij}$ = an unexplained residual.


The calculated yields and residuals from equation (2) are

(3) $Y^{**}_{ij} = a^{*}_{0j} + a^{*}_{1j} T_{ij}$
(4) $e^{*}_{ij} = Y_{ij} - Y^{**}_{ij}$
The estimated time trend coefficient ($a^{*}_{1j}$) was not significantly different from
zero for many eastern grain-growing regions. Therefore, the value of $c^{*}_{1j}$ in
equation (1) is set at zero for these regions.⁸
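
A minimal sketch of step one for a single crop region follows: an ordinary least squares fit of yield on time, the residuals that feed the weather regression, and a t statistic for the trend coefficient. The four-year series is hypothetical, and the report does not say which significance test or threshold was applied, so the t statistic is shown only as one conventional way of judging whether a trend differs from zero.

```python
import math


def fit_trend(time, yields):
    """Least squares fit of yield = a0 + a1 * time, with residuals and t statistic."""
    n = len(time)
    t_mean = sum(time) / n
    y_mean = sum(yields) / n
    sxx = sum((t - t_mean) ** 2 for t in time)
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(time, yields))
    a1 = sxy / sxx
    a0 = y_mean - a1 * t_mean
    residuals = [y - (a0 + a1 * t) for t, y in zip(time, yields)]
    # Standard error of the slope from the residual variance (n - 2 d.f.).
    s2 = sum(e ** 2 for e in residuals) / (n - 2)
    t_stat = a1 / math.sqrt(s2 / sxx)
    return a0, a1, residuals, t_stat


# Hypothetical yields (centners per hectare) for four consecutive years.
a0, a1, residuals, t_stat = fit_trend([1, 2, 3, 4], [10.0, 12.0, 11.0, 13.0])
print(a0, a1, [round(e, 2) for e in residuals], round(t_stat, 2))
```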


25. In the second step, we regressed the residuals calculated in equation
(4) above on 11 weather variables to obtain an index of the influence of
weather. The available weather data for each crop region are limited to 13
observations (1961-73) for crop regions 1-27. Therefore, in order to have enough
observations to estimate the weather index, we arranged these 27 crop regions
into 13 groups. We then computed the following regression equation:


(5) $e^{*}_{ijm} = b_{0m} + \sum_{k=1}^{11} b_{km} W_{kijm} + u_{ij}$

where $e^{*}_{ijm}$ = the estimated residual from equation (4) for year $i$ in
crop region $j$, which is assigned to one of the 13
regional groups ($m$),
$W_{kijm}$ = weather variable $k$ ($k = 1, \ldots, 11$), and
$u_{ij}$ = an unexplained residual.


26. The aggregation of the data into 13 groups of crop regions is called
pooling the data.⁹ The use of a single regression equation to express the
relationship of yield to weather variables for a multi-region area implicitly assumes that
the weather variables have the same influence on yield in each of the regions
grouped. Thus, it was necessary to select groupings of crop regions with similar
climates, soils, and cultivation practices.
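
Pooling can be sketched as stacking the observations of every region in a group into one regression. The example below is a simplified illustration, not the program actually used: it fits only two weather variables instead of eleven, uses invented data for two regions, and solves the normal equations with a small hand-written Gaussian elimination so that it has no external dependencies.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def pooled_ols(weather_rows, residuals):
    """Regress trend residuals on weather variables; returns [b0, b1, ..., bK]."""
    design = [[1.0] + list(row) for row in weather_rows]   # 1.0 is the intercept
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * e for d, e in zip(design, residuals)) for i in range(k)]
    return solve(xtx, xty)


# Hypothetical (precipitation mm, temperature C) observations for two regions
# assigned to the same group, three years each, stacked into one sample.
region_a = [(50.0, 10.0), (80.0, 12.0), (30.0, 9.0)]
region_b = [(60.0, 11.0), (40.0, 14.0), (90.0, 10.0)]
rows = region_a + region_b
# Residuals constructed as 1.0 + 0.02 * precipitation - 0.5 * temperature.
resid = [1.0 + 0.02 * p - 0.5 * t for p, t in rows]

b = pooled_ols(rows, resid)
print([round(v, 4) for v in b])
```

With six pooled observations the three coefficients can be estimated, whereas three observations from a single region would leave no degrees of freedom, which is the motivation for pooling given in the text.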


27. In the third step, we calculated the weather index $V^{*}_{ij}$ for each crop
region. Given the parameters $b^{*}_{km}$ ($k = 0, \ldots, 11$) estimated from equation (5),
the weather index was computed as:

(6) $V^{*}_{ij} = b^{*}_{0m} + \sum_{k=1}^{11} b^{*}_{km} W_{kij}$

where $V^{*}_{ij}$ = weather index for crop region $j$ in year $i$,
$W_{kij}$ = weather variable $k$ for crop region $j$ in year $i$,
and all 27 crop regions ($j$) are assigned to one of 13
regional groups ($m$).
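
Equation (6) amounts to applying a group's coefficients to a region's own weather observations. The sketch below assumes a lookup from crop region to regional group; the group label, coefficient values, and the use of only two weather variables are hypothetical simplifications.

```python
# Hypothetical assignment of crop regions to a regional group, and hypothetical
# group coefficients [b0, b1, b2] for (precipitation mm, temperature C).
GROUP_OF_REGION = {13: "group_g", 14: "group_g"}
GROUP_COEFFICIENTS = {"group_g": [1.0, 0.02, -0.5]}


def weather_index(region, weather):
    """Weather index V* for one region and year from its group's coefficients."""
    b = GROUP_COEFFICIENTS[GROUP_OF_REGION[region]]
    return b[0] + sum(bk * wk for bk, wk in zip(b[1:], weather))


index_13 = weather_index(13, [70.0, 11.0])
index_14 = weather_index(14, [40.0, 13.0])
print(round(index_13, 2), round(index_14, 2))
```

Regions in the same group share coefficients but receive different index values because their own weather observations differ.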

28. In the fourth step, we estimated the coefficients $c_{0j}$, $c_{1j}$, and $c_{2j}$
($j = 1, \ldots, 33$) in the model's grain yield prediction equation (1) by performing
multivariate linear regression using equation (7) and the values for $V^{*}_{ij}$ estimated
in equation (6).

(7) $Y_{ij} = c_{0j} + c_{1j} T_{ij} + c_{2j} V^{*}_{ij} + u_{ij}$
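
For a single crop region, equation (7) is a regression on two explanatory variables, which has a closed-form least squares solution. The sketch below uses that closed form on invented data generated from known coefficients; it is an illustration of the technique, not a reproduction of the original computer program.

```python
def fit_yield_equation(yields, time, index):
    """Least squares fit of yield = c0 + c1 * time + c2 * weather index."""
    n = len(yields)
    my, mt, mv = sum(yields) / n, sum(time) / n, sum(index) / n
    stt = sum((t - mt) ** 2 for t in time)
    svv = sum((v - mv) ** 2 for v in index)
    stv = sum((t - mt) * (v - mv) for t, v in zip(time, index))
    sty = sum((t - mt) * (y - my) for t, y in zip(time, yields))
    svy = sum((v - mv) * (y - my) for v, y in zip(index, yields))
    det = stt * svv - stv ** 2          # zero only if time and index are collinear
    c1 = (svv * sty - stv * svy) / det
    c2 = (stt * svy - stv * sty) / det
    c0 = my - c1 * mt - c2 * mv
    return c0, c1, c2


# Hypothetical data generated as yield = 5 + 0.5 * time + 2 * index.
time = [1, 2, 3, 4, 5]
index = [0.5, -1.0, 0.0, 1.5, -1.0]
yields = [5 + 0.5 * t + 2 * v for t, v in zip(time, index)]

c0, c1, c2 = fit_yield_equation(yields, time, index)
forecast = c0 + c1 * 6 + c2 * 0.0   # next year with a neutral weather index
print(round(c0, 3), round(c1, 3), round(c2, 3), round(forecast, 3))
```

Setting `c1` or `c2` to zero for a region, as the text describes for regions without a significant trend or without weather data, reduces the same equation to a one-variable or constant forecast.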
The Pattern of the Regression Coefficients


29. The signs of the estimated coefficients of the weather regression,
equation (5), are shown in Table 2. The coefficients for October-March
precipitation are generally positive (16 of 27 crop regions). Preseason precipitation
would be expected to have a positive effect on yield, especially in the winter
grain areas. Nonetheless, many of the negative coefficients are in winter grain


⁸ Crop regions 6, 10, and 20-27.

⁹ Pooling data is the use of cross-section data with time series data. Pooling the data
increases the number of observations used in estimating the parameters in each equation and
hence increases the number of degrees of freedom.


Table 2

Coefficient Signs of Weather Variables¹

[Table body not recoverable. Columns: Regions; Precipitation (Oct-Mar, Apr, May,
Jun, Jul, Aug); Temperature (Apr, May, Jun, Jul, Aug). Rows: groups of crop regions.
The coefficient signs are illegible in the source copy.]

¹ These coefficients resulted from using the 11 weather variables enumerated with the grouping
of crop regions given. Regions 3 and 17 were estimated separately with a reduced set of weather
variables.


areas. Precipitation seems to be important in April, May, and June, with the
coefficients generally positive. The coefficients for April temperature are mostly
positive, but several turn negative in May and June, suggesting that very hot
weather at the onset of heading is not desirable. The need for dry weather at
harvest time clearly shows up in the winter grain areas in July, as most
precipitation coefficients shift from positive to negative. The spring grain areas retain
their positive coefficients in July, but two of them become negative in August.


30. The results presented in Table 2 are generally encouraging, although
the signs do not correspond with agronomic theory in several instances.
Discrepancies could arise for a number of reasons:


• Although weather and time were assumed to be unrelated, in fact
weather may be cyclical. Since 1961, precipitation in the USSR has
tended to increase and average temperature has declined slightly,
affecting the estimated time trend in equation (2) and biasing the
weather variable coefficients.


• Weather variables also may be related within the same year, contrary
to the model's assumption that the weather variables in equation (5)
are independent. Colinearity of this nature could easily cause a sign
reversal in the weather coefficients.


• Several of the weather variable coefficients were statistically
insignificant.¹⁰ There are three likely causes: (1) a variable truly is not
related to yield, (2) a variable is colinear with another weather variable
and therefore does not receive its full credit,¹¹ and (3) the small number
of observations precludes an accurate test of significance.


¹⁰ A coefficient is statistically insignificant if it cannot be stated with a high probability
(usually 95%) that the true value of the coefficient is non-zero.

¹¹ A number of tests were made by reestimating equation (5) after excluding one or
more insignificant weather variables. These tests did not show any important sign reversals,
but did sharply improve the significance of the remaining variables. This indicates the presence
of colinearity among the weather variables.


Approved For Release 2000/09/14 : CIA-RDP86T00608R000500230008-5 




• The types of grain grown within each crop region and in the nation
change over time because of weather, shifts in demand, and technological
change. The coefficients of the weather equation reflect changes
in the structure of the sown area and therefore are unlikely to apply
fully in any particular year.


PREDICTIONS FOR 1974 


31. The use of the model to forecast grain yields began in April and con- 
tinued throughout the summer months. The forecasts were updated every ten 
days, as new weather data were received. Two basic types of forecasts were 
made during the 1974 crop year. The first used only weather variables for 
which 1974 data had been received. For example, the forecast made as of the 
end of April did not include the influence of weather in May, June, July, and 
August. The second type used all weather variables, substituting long-run norm
values for variables not yet reported. The results reported here are all from the
second type of forecast.
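
The second type of forecast can be sketched as filling every weather variable that has not yet been reported with its long-run norm before the weather index is computed. The variable names and values below are hypothetical.

```python
def fill_with_norms(observed, norms):
    """Use observed values where reported, long-run norms everywhere else."""
    return {name: observed.get(name, norm) for name, norm in norms.items()}


# Hypothetical long-run norms for four of the weather variables.
norms = {"precip_apr": 30.0, "precip_may": 45.0, "temp_apr": 6.0, "temp_may": 14.0}
# At the end of April only the April observations have been received.
observed = {"precip_apr": 22.0, "temp_apr": 7.5}

inputs = fill_with_norms(observed, norms)
print(inputs)
```

A preseason forecast is the limiting case in which nothing has been observed and every variable takes its norm value.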


32. In 1974 the estimate of grain production ranged from 184 to 203 million
tons. A preseason estimate of 185.9 million tons was based on long-run norm
values for all weather variables. The tabulation below shows how the
predictions changed as additional weather data were received.


                     Predicted Yield      Predicted 1974
                     (Centners per        Grain Production
Weather Through      Hectare)             (Million Tons)
Preseason            14.5                 185.9
March                14.6                 186.3
April                [illegible]          184.2
May                  14.8                 189.9
June                 [illegible]          203.3
July                 [illegible]          197.8
August               15.6                 199.4


The development of the harvest prediction becomes clearer when republic
estimates are examined. The gains made in Kazakhstan through April were more
than canceled out by poor weather in the rest of the country. In May, Kazakhstan
fared poorly, but developments in the Ukraine easily compensated for
Kazakhstan's diminished prospects and provided the first hope for a good year. In June,
predicted yields in the Ukraine continued to increase, and those in the RSFSR
made a very large jump as a result of abundant rains. Predicted harvests in all
areas fell off in July and recovered slightly in August. The final line in the
tabulation of predicted 1974 grain production below shows the production as reported
in December 1974. The predicted values for the RSFSR and the Ukraine are
3.7 million and 1.9 million tons too high. Since this is more than the difference of
3.9 million tons for the USSR prediction, the prediction for Kazakhstan and other
areas must be too low.


                                                                  Million Tons

Weather Through       RSFSR         Ukraine   Kazakhstan   Other Areas   Total
Preseason             109.4         39.6      19.8         17.1          185.9
March                 109.2         40.3      19.8         17.0          186.3
April                 107.8         39.1      21.8         15.5          184.2
May                   [illegible]   46.0      19.1         16.9          189.9
June                  116.3         48.6      21.0         17.4          203.3
July                  114.0         47.1      20.5         16.2          197.8
August                115.3         47.9      19.5         16.7          199.4
Reported production   111.6         46.0      N.A.         N.A.          195.5
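
The reasoning about Kazakhstan and other areas can be checked with simple arithmetic on the August predictions and the reported December totals quoted in this publication. The variable names are illustrative only.

```python
# Final (August) predictions and reported production, in million tons.
predicted = {"RSFSR": 115.3, "Ukraine": 47.9, "USSR": 199.4}
reported = {"RSFSR": 111.6, "Ukraine": 46.0, "USSR": 195.5}

# Prediction minus reported for each area with a published total.
difference = {k: round(predicted[k] - reported[k], 1) for k in predicted}

# Whatever is not explained by the RSFSR and the Ukraine must come from
# Kazakhstan and other areas, for which no reported figure is available.
rest = round(difference["USSR"] - difference["RSFSR"] - difference["Ukraine"], 1)

print(difference, rest)
```

The negative remainder indicates that the combined prediction for Kazakhstan and other areas fell short of the implied actual harvest there.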
33. Tabulations of the progress of the predictions for winter and spring
wheat indicate that winter wheat caused the increase in the predicted grain crop
in May, and spring wheat caused the large jump in June.


WINTER WHEAT
                                                      Million Tons

Weather Through   RSFSR         Ukraine       Other Areas   Total
Preseason         [illegible]   20.1          5.6           41.9
March             15.7          20.1          5.6           [illegible]
April             [illegible]   20.3          5.4           40.0
May               [illegible]   [illegible]   5.7           44.4
June              [illegible]   23.4          5.9           45.6
July              14.4          22.9          5.7           43.0


SPRING WHEAT
                                                      Million Tons

Weather Through   RSFSR         Kazakhstan    Other Areas   Total
Preseason         [illegible]   12.3          0.1           44.5
March             [illegible]   12.4          0.1           44.7
April             [illegible]   13.1          0.1           45.8
May               [illegible]   11.6          0.1           [illegible]
June              [illegible]   12.6          0.1           48.9
July              35.9          11.6          0.1           47.6
August            36.6          10.6          0.1           47.3

EVALUATION OF THE MODEL 


34. The value of the model was tested by computing its prediction errors
using historical data and by comparing its performance with other models. First,
we deleted data and reestimated the model coefficients from the remaining
data.¹² We predicted crop yields from equation (1) for each of the 33 crop regions.
Next we computed an average USSR yield using reported sown areas. We then
computed the difference between our predicted USSR yield and the yield
reported by the USSR Central Statistical Administration. The following tabulation
shows the direction and percentage magnitude of the prediction errors, i.e.,

$$\frac{\text{predicted yield} - \text{reported yield}}{\text{reported yield}} \times 100$$

Years Covered        Percentage Prediction Errors for All Grain Yields
in Model
Estimation     1968     1969     1970     1971     1972     1973     1974¹
1958-67        -10.0    4.5      -7.2     -11.8    -1.6     -13.3    -10.1
1958-68                 -6.3     0.4      -13.2    -1.6     -14.5    -5.9
1958-69                          3.4      -9.0     6.2      -10.0    -4.2
1958-70                                   -8.6     3.8      -7.6     -0.8
1958-71                                            8.2      -9.2     2.5
1958-72                                                     -8.1     1.2
1958-73                                                              2.0


¹ Total sown area has not been reported for 1974; the estimated yield is based on an
estimated sown area of 128 million hectares.


¹² This procedure resembles the way the model would be used in practice. As data for
additional years become available, the model would be reestimated.
This test suggests that an error of between 2% and 8% should be expected:
the range of the percentage errors shown on the diagonal for the years 1969-74.
While the upper part of this range is too high from the standpoint of making
useful forecasts,¹³ experience has shown that collateral information can be used to
detect anomalies and reduce the error.
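
The 2% to 8% range can be reproduced from the diagonal of the tabulation, that is, the errors for the first year after each estimation period. The sketch below uses the diagonal values for 1969-74 as read from the tabulation above; the mean absolute error is an added summary not stated in the text, and the 1974 check uses the production figures quoted in this publication rather than yields.

```python
def pct_error(predicted, reported):
    """Percentage prediction error: (predicted - reported) / reported * 100."""
    return (predicted - reported) / reported * 100.0


# One-year-ahead errors (percent) on the diagonal, 1969 through 1974.
one_step_ahead = {1969: -6.3, 1970: 3.4, 1971: -8.6, 1972: 8.2, 1973: -8.1, 1974: 2.0}

magnitudes = [abs(e) for e in one_step_ahead.values()]
smallest, largest = min(magnitudes), max(magnitudes)
mean_abs = sum(magnitudes) / len(magnitudes)

# Consistency check for 1974: predicted 199.4 against reported 195.5 million tons.
check_1974 = round(pct_error(199.4, 195.5), 1)

print(smallest, largest, round(mean_abs, 1), check_1974)
```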


35. Next we constructed several naive models to determine whether our
model provides better predictions. Even the best of these naive models did
not perform well enough to merit consideration. In the best of the naive models,
yield was a linear function of time, past yields, and selected weather variables.
The data for all crop regions for which weather data were available (regions
1-27) were used in the same equation. This equation was able to describe
historical yields fairly well (the coefficient of determination was 0.8), but it
did not capture the cyclical behavior of Soviet agriculture, and the average
absolute percentage prediction error was almost double that of our disaggregated
model, based on predictions for 1969-72.


36. In evaluating the model described in this publication (and especially the
predictions for 1974), the fact that production forecasts depend on the accuracy
of sown area must be kept in mind. Although actual sown areas have not been
reported for 1974, the errors in estimates of total sown areas in 1973 ranged
from 2.1% for all grain to 13.1% for winter wheat, as shown below.


                      Million Hectares

                                Estimates Used in     Percentage
Crop               Actual       Prediction Model      Error
All grain          126.7        124.0                 -2.1
Spring wheat       44.8         47.0                  4.9
Winter wheat       18.3         15.9                  -13.1


Sown area data by crop region in 1973 have been received only for the Ukraine.
A comparison of actual sown areas in the Ukraine in 1973 with those estimated on 
the basis of laborious review of published information also shows substantial 
errors, as follows: 


                              All Grain                           Winter Wheat
                        Million Hectares     Percentage     Million Hectares     Percentage
Crop Region             Actual   Estimated   Error¹         Actual   Estimated   Error¹
Total Ukraine²          16.6     15.2        -8.8           8.3      7.0         -16.4
Western Ukraine         2.5      2.3         -6.9           1.2      1.2         -3.3
North Central Ukraine   4.5      4.2         -6.6           2.1      2.0         -3.9
Northeast Ukraine       2.6      2.4         -7.4           1.0      1.2         14.9
Eastern Ukraine         3.6      3.3         -9.2           1.9      1.0         -47.9
Southern Ukraine        3.5      3.0         -13.5          2.2      1.7         -22.8


¹ Percentages were derived from unrounded data.
² Because of rounding, components may not add to the totals shown.


Knowledge of the actual distribution of the 1973 Ukrainian sown area would not
have greatly improved the estimate of the overall Ukrainian yield (from 28.76
to 28.82 centners per hectare for winter wheat), but the error in the predicted
total harvest would have been sharply reduced (from -24.8% to -9.7% for
winter wheat). Thus, the effect on crop estimates of errors in the estimates of the
regional distribution of sown area appears to be much less than the effect of
errors in estimates of the total sown area.


¹³ An error of 8% applied to a national yield of 15 centners per hectare and a sown area
of 128 million hectares results in a harvest error of 15 million tons.
37. Although the model described in this publication presumably can be
improved as data on weather and yields accumulate, it must be used with care.
The essence of the model is a set of equations which estimate a linear
relationship between the yield of a given crop and a number of weather variables and a
time trend. There are several problems inherent in the model:


• The time trend is used as a surrogate for factors that the model, because
of lack of data, cannot account for specifically.


In any event, since the time trend averages the long-term effect of
other variables, it will not reflect their influence accurately in any
given year. The impact of a sharp increase in the supply of fertilizer
in a given year, for example, would not be reflected in the results of the
model.


• The weather indexes measure the influence of weather variables in
an “average” year, but each year is different with respect to the
timing of crop developments.


• The assumed linear influence of the weather variables undoubtedly
fails to capture the effects of extreme weather accurately.


38. The range of the expected prediction error can probably be narrowed 
by introducing nonlinear variables, additional experiments with the grouping of 
crop regions, or other weather variables. Also, the data base will gradually
increase, adding to the number of observations available to estimate the parameters
of the weather variable. Despite these improvements, it will still be necessary 
to consider collateral information about variations in the crop calendar, harvesting 
conditions, use of fertilizer, and amounts and quality of seed available and to 
adjust the model predictions accordingly. 
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APPENDIX 


DESCRIPTION OF THE DATA BASE 


Weather Data 

Four basic types of weather data covering the period from 1961 to the 
present are currently maintained in computer files for use in weather-yield 
regressions: precipitation by decade¹ for the months of April through September, 
total precipitation by month, average monthly temperature, and soil moisture as 
of the last day of each month. 


Sources and Reliability 

Basic precipitation and temperature data for individual weather stations in 
the USSR are broadcast two to four times daily by Soviet radio stations.² From 
these broadcasts, weather data are interpolated for some 134 grid points spaced 
roughly 75 miles apart in a square pattern throughout the area studied. 
Periodically, total precipitation, mean temperature, and soil moisture are calculated 
for each grid point and aggregated into values for crop districts. 
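
The aggregation of grid-point values into crop-district values can be sketched as follows. The text does not say how grid points are weighted, so the unweighted mean used here is an assumption, and the grid-point and district identifiers are hypothetical.

```python
from collections import defaultdict


def aggregate_to_districts(grid_values, grid_to_district):
    """Average grid-point values within each crop district.

    grid_values: {grid_point_id: value}, e.g. monthly precipitation
    grid_to_district: {grid_point_id: crop_district_id}
    ASSUMPTION: every grid point in a district gets equal weight.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for point, value in grid_values.items():
        district = grid_to_district[point]
        sums[district] += value
        counts[district] += 1
    return {district: sums[district] / counts[district] for district in sums}


# Hypothetical example: three grid points falling in two crop districts.
precipitation = {"g1": 30.0, "g2": 50.0, "g3": 12.0}
district_of = {"g1": 13, "g2": 13, "g3": 14}
district_values = aggregate_to_districts(precipitation, district_of)
```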

Reliability of the weather data is primarily a function of the density of 
coverage—that is, the number of individual weather stations from which reports 
emanate affects the interpolated data for a given grid point. The density of 
coverage is most critical for the precipitation values because there can be con- 
siderable variation in precipitation over a relatively small area, especially in 
hilly country. Temperature values, on the other hand, are considered to be 
more continuous. 


Before April 1965 the density of coverage averaged one reporting weather 
station per grid point, largely because of limitations imposed by the time- 
consuming manual method of interpolation used. Since that date, computer 
analysis has applied a complex method of interpolation that permitted increasing 
the density of coverage to four or five reporting weather stations per grid point. 
Consequently, the interpolated values for precipitation and temperature since 
April 1965 are considerably better indicators of weather conditions at the grid 
points than those for earlier years. 


Storage and Handling of Weather Data 


The basic weather data are received at approximately ten-day intervals 
and stored on a computer disk file. Six categories of weather variables are 
stored in the file in the following order: 


1. First decade precipitation 

2. Second decade precipitation 

3. Third decade precipitation 

4. Total monthly precipitation 

5. Average monthly temperature 

6. Soil moisture (last day of month) 


¹ Each month is divided into three periods called “decades.” 
² As a member of the World Meteorological Organization, the USSR shares such information 
with foreign countries. 

Each of the three decade categories contains only six variables (representing the 
months of September and April through August), while the remaining three 
categories contain 12 variables each (one for each month of the year). A complete 
list of the weather variables is shown in Table A-1. 


TABLE A-1 


STANDARD WEATHER VARIABLES AVAILABLE 


Number    Name of Variable 

 1        Precipitation, 1st Decade of September 
 2        Precipitation, 1st Decade of April 
 3        Precipitation, 1st Decade of May 
 4        Precipitation, 1st Decade of June 
 5        Precipitation, 1st Decade of July 
 6        Precipitation, 1st Decade of August 
 7        Precipitation, 2d Decade of September 
 8        Precipitation, 2d Decade of April 
 9        Precipitation, 2d Decade of May 
10        Precipitation, 2d Decade of June 
11        Precipitation, 2d Decade of July 
12        Precipitation, 2d Decade of August 
13        Precipitation, 3d Decade of September 
14        Precipitation, 3d Decade of April 
15        Precipitation, 3d Decade of May 
16        Precipitation, 3d Decade of June 
17        Precipitation, 3d Decade of July 
18        Precipitation, 3d Decade of August 
19        Precipitation - September 
20        Precipitation - October 
21        Precipitation - November 
22        Precipitation - December 
23        Precipitation - January 
24        Precipitation - February 
25        Precipitation - March 
26        Precipitation - April 
27        Precipitation - May 
28        Precipitation - June 
29        Precipitation - July 
30        Precipitation - August 
31        Average Temperature (Absolute Value) - September 
32        Average Temperature (Absolute Value) - October 
33        Average Temperature (Absolute Value) - November 
34        Average Temperature (Absolute Value) - December 
35        Average Temperature (Absolute Value) - January 
36        Average Temperature (Absolute Value) - February 
37        Average Temperature (Absolute Value) - March 
38        Average Temperature (Absolute Value) - April 
39        Average Temperature (Absolute Value) - May 
40        Average Temperature (Absolute Value) - June 
41        Average Temperature (Absolute Value) - July 
42        Average Temperature (Absolute Value) - August 
43        Soil Moisture (Last Day of the Month) - September 
44        Soil Moisture (Last Day of the Month) - October 
45        Soil Moisture (Last Day of the Month) - November 
46        Soil Moisture (Last Day of the Month) - December 
47        Soil Moisture (Last Day of the Month) - January 
48        Soil Moisture (Last Day of the Month) - February 
49        Soil Moisture (Last Day of the Month) - March 
50        Soil Moisture (Last Day of the Month) - April 
51        Soil Moisture (Last Day of the Month) - May 
52        Soil Moisture (Last Day of the Month) - June 
53        Soil Moisture (Last Day of the Month) - July 
54        Soil Moisture (Last Day of the Month) - August 
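
The storage order described above (three decade categories of six variables each, followed by three monthly categories of 12 variables each) implies 54 standard variables. A minimal sketch that enumerates them in that order is given below; the function name and the exact spelling of the variable labels are illustrative assumptions, not the layout of the actual disk file.

```python
# Months covered by the decade precipitation categories, in stored order.
DECADE_MONTHS = ["September", "April", "May", "June", "July", "August"]

# Months covered by the monthly categories (crop year, September to August).
YEAR_MONTHS = ["September", "October", "November", "December", "January",
               "February", "March", "April", "May", "June", "July", "August"]


def standard_weather_variables():
    """Return the variable names in the order the six categories are stored."""
    names = []
    for decade in ("1st", "2d", "3d"):
        names += [f"Precipitation, {decade} Decade of {month}"
                  for month in DECADE_MONTHS]
    for category in ("Precipitation",
                     "Average Temperature (Absolute Value)",
                     "Soil Moisture (Last Day of the Month)"):
        names += [f"{category} - {month}" for month in YEAR_MONTHS]
    return names


variables = standard_weather_variables()
# 1-based position of a variable, e.g. total precipitation for May
may_precipitation_number = variables.index("Precipitation - May") + 1
```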
The data bases for harvest and sown area have been assembled for five 
grain crops: (1) all grain, (2) winter wheat, (3) spring wheat, (4) spring 
barley, and (5) winter rye. These data bases are computerized and will be 
referred to as a “file.” Each file has a common structure; all are processed by the 
same computer program. The data sources are not the same, however, and some 
data have been estimated. 


Data Required 


For each crop, data are required for all 33 crop regions. The 33 crop regions 
are composed of 127 administrative divisions: 12 Soviet Socialist Republics 
(SSRs), 16 Autonomous Soviet Socialist Republics (ASSRs), 93 oblasts, and 
6 krays. For each of the 127 divisions, data on the size of the harvested crop in 
thousands of tons and the harvested area in thousands of hectares are needed. 
The computer program computes yields for each of the 127 divisions in centners 
per hectare, sums the harvested crop and the harvested area for all of the divisions 
within a crop region, and computes the yield for each crop region. 
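
The yield computation described above can be sketched as follows. The sketch relies on the fact that one metric ton equals 10 centners (a centner being 100 kilograms); the function names and the sample figures are hypothetical.

```python
CENTNERS_PER_TON = 10  # 1 metric ton = 10 centners of 100 kg each


def yield_centners_per_hectare(harvest_thousand_tons, area_thousand_ha):
    """Yield in centners per hectare from harvest and harvested area."""
    return CENTNERS_PER_TON * harvest_thousand_tons / area_thousand_ha


def region_yield(divisions):
    """Yield for a crop region from its divisions.

    divisions: list of (harvest_thousand_tons, area_thousand_ha) pairs.
    The harvest and area are summed first; the regional yield is the ratio
    of the two sums, not an average of the division yields.
    """
    total_harvest = sum(harvest for harvest, _ in divisions)
    total_area = sum(area for _, area in divisions)
    return yield_centners_per_hectare(total_harvest, total_area)


# Hypothetical crop region made up of two administrative divisions.
sample_divisions = [(1500.0, 1000.0), (600.0, 500.0)]
division_yields = [yield_centners_per_hectare(h, a) for h, a in sample_divisions]
sample_region_yield = region_yield(sample_divisions)
```

Summing before dividing weights each division by its harvested area, so a large division influences the regional yield more than a small one.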


Data Sources 


All Grain: Data on production and sown area for all grain are usually available 
from Soviet statistical handbooks published for the area and year of interest. 
Although data for Kazakhstan in 1962-64 and 1969-70 are missing, five-year 
moving averages for spring wheat in selected Kazakhstan oblasts were recently 
published. Since spring wheat is dominant in these areas, these data were used 
to estimate yields for 1962-64 and 1969-70 on the assumption that spring wheat 
yields were the same as for all grain. 


No other all grain data are estimated except in the sense that the computer 
program computes the yields. Sometimes this causes a variation of 0.1 centner 
per hectare from the yields published in Soviet statistical handbooks. 


Spring and Winter Wheat: Data for Estonia, Latvia, Lithuania, Belorussia, 
Moldavia, Kazakhstan, and the Ukraine are taken from various statistical abstracts. 
The statistical handbooks for the RSFSR normally publish data for all wheat only. 
In order to estimate data for winter and spring wheat separately, additional 
sources and estimation methods had to be used. Several estimation procedures 
were used, depending on the data available. 


The best estimates are for the years 1960, 1962, and 1963. The only data 
missing are the harvested crop by administrative division, which can be estimated 
by 

(1) Harvest = Area × Yield 


for each division. To adjust for errors introduced by the use of rounded data, 
published data on spring and winter wheat by economic region were used. The 
actual harvested crop for an economic region should equal the sum of the calculated 
harvested crops for the divisions within that region, and the published 
data on total wheat for each division should equal the sum of the calculated 
winter wheat harvest and the calculated spring wheat harvest. The calculated 
harvest data were adjusted to meet these conditions while remaining within the 
range of possible rounding error. 
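
The adjustment described above can be illustrated with a rough sketch. The actual refinement procedure is not documented here, so the proportional scaling, the rounding steps (area to the nearest thousand hectares, yield to the nearest 0.1 centner per hectare), and the function names are all assumptions.

```python
def harvest_bounds(area, yield_, area_step=1.0, yield_step=0.1):
    """Range of harvests (thousand tons) consistent with rounded inputs.

    area is in thousand hectares and yield_ in centners per hectare.
    ASSUMPTION: each published figure is rounded to the nearest step.
    """
    low = (area - area_step / 2) * (yield_ - yield_step / 2) / 10
    high = (area + area_step / 2) * (yield_ + yield_step / 2) / 10
    return low, high


def adjust_to_control_total(divisions, control_total):
    """Scale equation (1) estimates toward a published regional total.

    divisions: list of (area, yield_) pairs for one economic region.
    Each adjusted harvest is kept within its own rounding range, so the
    control total is met only when that is possible within the ranges.
    """
    estimates = [area * yield_ / 10 for area, yield_ in divisions]
    factor = control_total / sum(estimates)
    adjusted = []
    for (area, yield_), estimate in zip(divisions, estimates):
        low, high = harvest_bounds(area, yield_)
        adjusted.append(min(max(estimate * factor, low), high))
    return adjusted


# Hypothetical region: two divisions, published regional harvest of 2,103.
sample = [(1000.0, 15.0), (500.0, 12.0)]
adjusted_harvests = adjust_to_control_total(sample, 2103.0)
```

If the published total cannot be reached without leaving the rounding ranges, the sketch stops at the range limits, which signals that some other source of error is present.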


The estimates for 1958 and 1971 use essentially the same method. For 1958, 
exact data on the harvested crop by economic region are not available. The 
harvested crop was estimated using data on area and yields by economic region. 
Hence the adjustments to the calculated harvest are not as accurate. For 1971, data on the 
harvested area of winter and spring wheat are not available by economic region or 
by administrative division. Data on the harvested crop are available; hence 
equation (1) can be rearranged as 


(2) Area = Harvest ÷ Yield 


in order to estimate sown areas. The estimated sown areas were then refined using 
the same type of check as described above. 


For 1965-69, data for spring or winter wheat have not been reported by 
administrative division for either the harvested crop or the harvested area. Equations 
(3) and (4) were solved simultaneously to estimate these data for each administrative 
division. 


(3) $H = (YS) \times (AS) + (YW) \times (AW)$ 
(4) $A = AS + AW$ 


where: 
H = total wheat harvest 
YS = yield of spring wheat 
YW = yield of winter wheat 
AS = sown area of spring wheat 
AW = sown area of winter wheat 
A = total sown area of wheat 


H, A, YS, and YW are known. The solution is: 


(5) $AW = \dfrac{H - (A) \times (YS)}{YW - YS}$ 
(6) $AS = A - AW$ 


Because the denominator of equation (5) is the difference of two numbers of 
similar magnitudes, these estimates are susceptible to large errors. The refinement 
procedures discussed above were also used for these calculations, but had 
a much greater impact because larger errors were encountered. 
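
The solution in equations (5) and (6), and its sensitivity to a small yield difference, can be sketched as follows. The function name and all figures are hypothetical; the harvest is expressed in units of area times yield so that no unit conversion is needed.

```python
def split_wheat_area(total_harvest, total_area, spring_yield, winter_yield):
    """Solve equations (3) and (4) for the spring and winter wheat areas."""
    denominator = winter_yield - spring_yield
    if denominator == 0:
        raise ValueError("equal yields: the areas cannot be separated")
    winter_area = (total_harvest - total_area * spring_yield) / denominator  # (5)
    spring_area = total_area - winter_area                                   # (6)
    return spring_area, winter_area


# True areas in both cases: spring 600, winter 400 (total 1,000).
# Case 1: yields far apart (10 and 20), so the total harvest is 14,000.
spring_ok, winter_ok = split_wheat_area(14000.0, 1000.0, 10.0, 20.0)

# The same case with the winter yield misreported by 0.1 (20.1, not 20.0).
_, winter_wide = split_wheat_area(14000.0, 1000.0, 10.0, 20.1)

# Case 2: yields close together (10 and 11), so the total harvest is 10,400.
# The same 0.1 error in the winter yield (11.1, not 11.0) moves the estimate
# of winter area much further from the true 400.
_, winter_close = split_wheat_area(10400.0, 1000.0, 10.0, 11.1)
```

In this made-up example the 0.1 error shifts the winter area estimate by about 1% when the yields differ by 10 centners per hectare, but by about 9% when they differ by only 1.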


In order to gauge the accuracy of this method, equations (5) and (6) were 
computed for all years 1958-69. Direct comparisons with published data for 1958, 
1960, 1962, and 1963 showed that as long as the difference between spring and 
winter wheat yields (the denominator of equation (5)) was not close to zero, the estimates 
were quite good. The refinements using control totals for economic regions and 
the relative stability over time of sown areas provide additional confidence in the 
estimates. 


The situation for 1961 and 1964 is similar to that for 1965-69, except that control 
totals for harvested area by economic region are not available. But the harvested 
crop and yield together imply an estimated harvested area for economic 
regions that is sufficiently accurate to provide usable refinements of estimates 
for administrative divisions. For 1959, there are data by economic region for 
harvested area but not for harvested crops. As for 1961 and 1964, crops can be 
estimated using data on yields and harvested area by economic region. 


Data for 1970 were the hardest to estimate. Data for winter and spring wheat 
have not been reported separately by oblast, only for total wheat. But the source 
of data for 1971 also gives 1971 results as a percent of the 1966-70 average. From 
this information, the average yields and average harvested crops could be calculated 
for 1966-70. The estimates for 1970 are then given by 


(7) $Y_{1970} = 5\bar{Y} - \sum_{i=1966}^{1969} Y_i$ 

    $A_{1970} = 5\bar{A} - \sum_{i=1966}^{1969} A_i$ 


where: 
$Y_i$ = yield in year i 
$A_i$ = sown area in year i 
$\bar{Y}$ = average yield in 1966-70 
$\bar{A}$ = average sown area in 1966-70 
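
Equation (7) recovers the 1970 value from a five-year average and the four earlier annual values. A minimal sketch follows; the function name and the yields are hypothetical.

```python
def back_out_final_year(period_average, earlier_values):
    """Recover the last year's value from a period average.

    period_average: simple average over all years of the period
    earlier_values: values for every year of the period except the last
    """
    number_of_years = len(earlier_values) + 1
    return number_of_years * period_average - sum(earlier_values)


# Hypothetical yields (centners per hectare) for 1966-69 and a 1966-70
# average of 11.0; the implied 1970 yield is 5 * 11.0 - (12 + 10 + 11 + 9).
yield_1970 = back_out_final_year(11.0, [12.0, 10.0, 11.0, 9.0])
```

Because the result is a difference between a rounded average multiplied by five and a sum of four rounded values, rounding errors in the inputs accumulate in the 1970 estimate, which is one reason the refinement checks matter.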


These estimates were then refined by checking them against economic region 
data and against data for total wheat by administrative division. In most cases 
the estimates did not have to be substantially revised. 


Spring Barley: Data on spring barley are not published consistently. The 
statistical handbooks for the Ukraine regularly publish data by oblast, but other 
handbooks normally do not. The data for the Baltic districts, Belorussia, and 
Moldavia come mostly from the agricultural handbook Sel’skoye khozyaystvo 
(1970) and are directly available as yields in centners per hectare. Data for 
Kazakhstan are not available for the years 1962-64 and 1969-71. 


Data for the RSFSR are available at the administrative division level by 
yield only for the years 1958-69. Weights were estimated in order to combine 
the yields of the divisions within a crop region. They are based on data on 
sown area derived from a Soviet journal article and from a Soviet agricultural 
atlas.* The only data available for 1970 and 1971 are for economic regions. They 
are usable for two regions, the Central and the Volga-Vyatsk, which are very 
close in definition to crop districts 13 and 14. The weighted averages for each 
crop district were computed and the results entered in the place of one of the 
divisions in each crop district. 
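
Combining division yields with sown-area weights can be sketched as follows. The function name and the figures are hypothetical; the weights stand for the estimated sown areas mentioned above.

```python
def weighted_average_yield(yields, weights):
    """Average division yields using sown-area weights."""
    total_weight = sum(weights)
    return sum(y * w for y, w in zip(yields, weights)) / total_weight


# Hypothetical crop region: two divisions with yields of 10 and 20 centners
# per hectare and estimated sown areas in the ratio 3 to 1.
barley_region_yield = weighted_average_yield([10.0, 20.0], [3.0, 1.0])
```

Only the relative size of the weights matters, so sown-area shares taken from an atlas can serve even when absolute areas are not known for every year.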


Winter Rye: Data by oblast for the RSFSR, which has 75% of the area sown 
to winter rye, have been published only through 1969. Oblast data for the Ukraine 
(less than 10% of the total area sown to winter rye) are available only since 1965. 
Reasonable estimates for 1958-64 can be made at the crop region level, however, 
from the data for economic regions, which are published annually. Various 
republic handbooks and the USSR handbook supply the data for the remaining 
crop regions. 


* Zernovoye khozyaystvo, No. 3, 1972, p. 17-20, and Atlas sel’skogo khozyaystva SSSR, 
Moscow, 1960. 